Automatic discrimination between laughter and speech

Authors

  • Khiet P. Truong
  • David A. van Leeuwen
Abstract

Emotions can be recognized by audible paralinguistic cues in speech. By detecting these paralinguistic cues, which can consist of laughter, a trembling voice, coughs, changes in the intonation contour, etc., information about the speaker’s state and emotion can be revealed. This paper describes the development of a gender-independent laugh detector with the aim of enabling automatic emotion recognition. Different types of features (spectral, prosodic) for laughter detection were investigated using different classification techniques (Gaussian Mixture Models, Support Vector Machines, Multi Layer Perceptron) often used in language and speaker recognition. Classification experiments were carried out with short pre-segmented speech and laughter segments extracted from the ICSI Meeting Recorder Corpus (with a mean duration of approximately 2 s). Equal error rates of around 3% were obtained when tested on speaker-independent speech data. We found that a fusion between classifiers based on Gaussian Mixture Models and classifiers based on Support Vector Machines increases discriminative power. We also found that a fusion between classifiers that use spectral features and classifiers that use prosodic information usually increases the performance for discrimination between laughter and speech. Our acoustic measurements showed differences between laughter and speech in mean pitch and in the ratio of the durations of unvoiced to voiced portions, which indicate that these prosodic features are indeed useful for discrimination between laughter and speech. © 2007 Published by Elsevier B.V.
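The abstract describes combining a generative classifier (GMM) with a discriminative one (SVM) through score-level fusion. The sketch below is a minimal, hypothetical illustration of that idea using scikit-learn: it is not the authors' configuration, the random feature vectors stand in for real spectral or prosodic features, and the GMM/SVM hyperparameters and equal fusion weights are placeholders (in practice fusion weights would be tuned on a development set).

```python
# Minimal sketch of GMM/SVM score-level fusion for laughter-vs-speech
# classification. Feature extraction is stubbed out with synthetic data;
# real systems would use spectral or prosodic features per segment.
import numpy as np
from sklearn.mixture import GaussianMixture
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Hypothetical per-segment feature vectors (placeholders for real features).
X_laugh = rng.normal(loc=1.0, size=(200, 13))    # laughter training segments
X_speech = rng.normal(loc=-1.0, size=(200, 13))  # speech training segments
X_test = rng.normal(loc=0.8, size=(50, 13))      # unseen test segments

# GMM classifier: one model per class, score = log-likelihood ratio.
gmm_laugh = GaussianMixture(n_components=4, random_state=0).fit(X_laugh)
gmm_speech = GaussianMixture(n_components=4, random_state=0).fit(X_speech)
gmm_scores = gmm_laugh.score_samples(X_test) - gmm_speech.score_samples(X_test)

# SVM classifier: discriminative decision score on the same features.
X_train = np.vstack([X_laugh, X_speech])
y_train = np.array([1] * len(X_laugh) + [0] * len(X_speech))
svm = SVC(kernel="rbf", gamma="scale").fit(X_train, y_train)
svm_scores = svm.decision_function(X_test)

# Linear score-level fusion after per-classifier score normalization.
def zscore(s):
    return (s - s.mean()) / s.std()

fused = 0.5 * zscore(gmm_scores) + 0.5 * zscore(svm_scores)
print("fused laughter scores for first 5 test segments:", fused[:5])
```

Thresholding the fused score then yields a laughter/speech decision, and sweeping that threshold over a labeled test set gives the detection-error trade-off from which an equal error rate can be read off.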


Similar references

Laughter Detection in Noisy Settings

Spontaneous human speech contains a lot of sounds that are not proper speech, yet carry meaning, laughter being a good example. Recognizing such sounds from speech sounds could improve speech recognition systems as well as widen the communicative range of automatic dialogue systems. Our goal is to develop methods for the automatic classification of non-speech vocal sounds. As laughter varies widely be...


Use of Vowels in Discriminating Speech-Laugh from Laughter and Neutral Speech

In natural conversations, a significant part of laughter co-occurs with speech, which is referred to as speech-laugh. Hence, speech-laugh will have characteristics of both laughter and neutral speech. But it is not clearly evident how the acoustic properties of neutral speech are influenced by its co-occurring laughter. The objective of this study is to analyze the acoustic variations between vowel re...


Laughter Classification Using Deep Rectifier Neural Networks with a Minimal Feature Subset

Laughter is one of the most important paralinguistic events, and it has specific roles in human conversation. The automatic detection of laughter occurrences in human speech can aid automatic speech recognition systems as well as some paralinguistic tasks such as emotion detection. In this study we apply Deep Neural Networks (DNN) for laughter detection, as this technology is nowadays considere...


Analysis of the occurrence of laughter in meetings

Automatic speech understanding in natural multiparty conversation settings stands to gain from parsing not only verbal but also non-verbal vocal communicative behaviors. In this work, we study the most frequently annotated non-verbal behavior, laughter, whose detection has clear implications for speech understanding tasks, and for the automatic recognition of affect in particular. To complement...


Laughter detection using ALISP-based N-Gram models

Laughter is a very complex behavior that communicates a wide range of messages with different meanings. It is highly dependent on social and interpersonal attributes. Most of the previous works (e.g. [1, 2]) on automatic laughter detection from audio use frame-level acoustic features as parameters to train their machine learning techniques, such as Gaussian Mixture Models (GMMs), Support Vecto...



Journal:
  • Speech Communication

Volume 49, Issue -

Pages -

Publication date 2007